7 research outputs found
A Multi-disciplinary Approach to Interactive Information Retrieval upon Semi-structured Data Sets
The so-called logical and probabilistic views on IR can be reconciled by a unifying framework for interactive information retrieval (IIR). I present a proposal for PhD research from a multidisciplinary perspective and discuss some of its consequences for IR as a discipline.
Prior Information and the Determination of Event Spaces in Probabilistic Information Retrieval Models
A mismatch between different event spaces has been used to argue against rank equivalence of classic probabilistic models of information retrieval and language models. We question the effectiveness of this strategy and argue that a convincing solution should be sought in a correct procedure for designing adequate priors for probabilistic reasoning. Acknowledging our solution to the event space issue invites us to rethink the relation between probabilistic models, statistics and logic in the context of IR.
Search for journalists: New York Times challenge report
We investigate how a user-centred approach to search design can improve support for user tasks specific to journalism.
Using example information needs sampled from our own exploration of the New York Times annotated corpus, we demonstrate how domain-specific notions rooted in a field theory of journalism can be transformed into effective search strategies. We present a method for search-context-aware classification of authorities, witnesses, reporters and columnists.
A first search strategy supports the journalistic task of investigating the trustworthiness of a news source, whereas the second search strategy supports assessments of the objectivity of an author.
In principle, these strategies can exploit the semantic annotations of the corpus; however, based on our preliminary work with the corpus, we conclude that straightforward full-text search is still a crucial component of any effective search strategy, as only recent articles are annotated and the annotations are far from complete.
Implicit relevance feedback from a multi-step search process: a use of query-logs
We evaluate the use of clickthrough information as implicit relevance feedback in sessions. We employ records of user interactions with a search system for picture retrieval: issued queries, clicked images, and purchased content. We investigate whether, and how much of, the past search history should be used in a feedback loop. We also assess the benefit of using clicked data as positive tokens of relevance for the task of estimating the probability that an image will be purchased.
CWI at TREC 2012, KBA track and Session Track
We participated in two tracks: the Knowledge Base Acceleration (KBA) track and the Session track. In the KBA track, we focused on experimenting with different approaches, as this is the first time the track has been run. We experimented with supervised and unsupervised retrieval models. Our supervised approaches include language models and a string-learning system. Our unsupervised approaches use 1) DBpedia labels and 2) the Google-Cross-Lingual Dictionary (GCLD). While the approach that uses GCLD targets the central and relevant bins, all the rest target the central bin. The GCLD and the string-learning system outperformed the others in their respective targeted bins. The goal of the Session track submission is to evaluate whether and how a logic framework for representing user interactions with an IR system can be used to improve the approximation of the relevant term distribution that another system, assumed to have access to the session information, will then calculate.
[...] the documents in the stream corpora. Three of the seven runs used a Hadoop cluster provided by Sara.nl to process the stream corpora. The other four runs used federated access to the same corpora, distributed among seven workstations.
Adapting Query Expansion to Search Proficiency
We argue that query expansion (QE) based on the full session improves the overall search experience, provided that we know how to adapt the QE weighting scheme to a user's search proficiency. We propose a strategy to predict search ability from session parameters. Using an exponential model and these metrics, we set user-dependent QE coefficients. We evaluate this approach on TREC 2011 Session track data.